https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2018.01169.x
http://www.youtube.com/watch?v=XcBLEVknqvY
https://www.rstudio.com/products/rstudio/download/
https://moderndive.com/2-getting-started.html
https://cran.r-project.org/web/packages/addinslist/README.html
https://rstudio.github.io/rstudioaddins/
Skipping install of 'addinexamples' from a github remote, the SHA1 (fae96091) has not changed since last install.
Use `force = TRUE` to force installation
I love the #rstats community.
— Frank Elavsky ᴰᵃᵗᵃ ᵂᶦᶻᵃʳᵈ (@Frankly_Data) July 3, 2018
Someone is like, “oh hey peeps, I saw a big need for this mundane but difficult task that I infrequently do, so I created a package that will literally scrape the last bits of peanut butter out of the jar for you. It's called pbplyr.”
What a tribe.
https://blog.mitchelloharawild.com/blog/user-2018-feature-wall/
Available CRAN Packages By Name https://cran.r-project.org/web/packages/available_packages_by_name.html
Bioconductor https://www.bioconductor.org
RDocumentation https://www.rdocumentation.org
R Package Documentation https://rdrr.io/
GitHub
Stack Overflow
How I use #rstats
— Emily Bovee (@ebovee09) August 10, 2018
h/t @ThePracticalDev pic.twitter.com/erRnTG0Ujr
http://cran.r-project.org/doc/contrib/Baggott-refcard-v2.pdf
https://www.rstudio.com/resources/cheatsheets/
https://github.com/qinwf/awesome-R#readme
https://twitter.com/hashtag/rstats?src=hash
install.packages("tidyverse", dependencies = TRUE)
install.packages("jmv", dependencies = TRUE)
install.packages("questionr", dependencies = TRUE)
install.packages("Rcmdr", dependencies = TRUE)
install.packages("summarytools")
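Running the install calls above every time re-downloads packages that are already present. A minimal sketch, assuming the same five package names, that first works out which ones are missing; `pkgs` and `missing_pkgs` are illustrative names that do not appear in the original notes, and the install line is left commented out so nothing is downloaded by accident.

```r
# Package names taken from the install.packages() calls above
pkgs <- c("tidyverse", "jmv", "questionr", "Rcmdr", "summarytools")

# installed.packages() returns a matrix whose row names are the installed packages
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Show what still needs installing (character(0) means nothing is missing)
print(missing_pkgs)

# Uncomment to install only the missing packages
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs, dependencies = TRUE)
```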
# install.packages("tidyverse", dependencies = TRUE)
# install.packages("jmv", dependencies = TRUE)
# install.packages("questionr", dependencies = TRUE)
# install.packages("Rcmdr", dependencies = TRUE)
# install.packages("summarytools")
https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
https://support.rstudio.com/hc/en-us/articles/218611977-Importing-Data-with-RStudio
Spreadsheet users using #rstats: where's the data?
#rstats users using spreadsheets: where's the code?
— Leonard Kiefer (@lenkiefer) July 7, 2018
year month day
Min. :2013 Min. : 1.000 Min. : 1.00
1st Qu.:2013 1st Qu.: 4.000 1st Qu.: 8.00
Median :2013 Median : 7.000 Median :16.00
Mean :2013 Mean : 6.549 Mean :15.71
3rd Qu.:2013 3rd Qu.:10.000 3rd Qu.:23.00
Max. :2013 Max. :12.000 Max. :31.00
dep_time sched_dep_time dep_delay
Min. : 1 Min. : 106 Min. : -43.00
1st Qu.: 907 1st Qu.: 906 1st Qu.: -5.00
Median :1401 Median :1359 Median : -2.00
Mean :1349 Mean :1344 Mean : 12.64
3rd Qu.:1744 3rd Qu.:1729 3rd Qu.: 11.00
Max. :2400 Max. :2359 Max. :1301.00
NA's :8255 NA's :8255
arr_time sched_arr_time arr_delay
Min. : 1 Min. : 1 Min. : -86.000
1st Qu.:1104 1st Qu.:1124 1st Qu.: -17.000
Median :1535 Median :1556 Median : -5.000
Mean :1502 Mean :1536 Mean : 6.895
3rd Qu.:1940 3rd Qu.:1945 3rd Qu.: 14.000
Max. :2400 Max. :2359 Max. :1272.000
NA's :8713 NA's :9430
carrier flight tailnum
Length:336776 Min. : 1 Length:336776
Class :character 1st Qu.: 553 Class :character
Mode :character Median :1496 Mode :character
Mean :1972
3rd Qu.:3465
Max. :8500
origin dest air_time
Length:336776 Length:336776 Min. : 20.0
Class :character Class :character 1st Qu.: 82.0
Mode :character Mode :character Median :129.0
Mean :150.7
3rd Qu.:192.0
Max. :695.0
NA's :9430
distance hour minute
Min. : 17 Min. : 1.00 Min. : 0.00
1st Qu.: 502 1st Qu.: 9.00 1st Qu.: 8.00
Median : 872 Median :13.00 Median :29.00
Mean :1040 Mean :13.18 Mean :26.23
3rd Qu.:1389 3rd Qu.:17.00 3rd Qu.:44.00
Max. :4983 Max. :23.00 Max. :59.00
time_hour
Min. :2013-01-01 05:00:00
1st Qu.:2013-04-04 13:00:00
Median :2013-07-03 10:00:00
Mean :2013-07-03 05:22:54
3rd Qu.:2013-10-01 07:00:00
Max. :2013-12-31 23:00:00
View(data)
data
head
tail
glimpse
str
skimr::skim()
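The inspection functions listed above can be tried on any data frame. A small sketch using the built-in `iris` data instead of the `data` object from the notes (an assumption, chosen so the example runs without importing a file); `first_rows` and `last_rows` are illustrative names.

```r
# iris ships with base R, so nothing has to be imported first
first_rows <- head(iris, 3)   # first 3 rows
last_rows  <- tail(iris, 3)   # last 3 rows

print(first_rows)
print(last_rows)

# Compact overview: one line per column with its type and first values
str(iris)

# With the packages installed, dplyr::glimpse(iris) gives a similar overview
# and skimr::skim(iris) adds per-column summary statistics.
# View(iris) opens the spreadsheet-style viewer in RStudio.
```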
The questionr package will be used.
https://juba.github.io/questionr/articles/recoding_addins.html
summary()
mean
median
min
max
sd
table()
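A short sketch of the summary functions listed above, again on the built-in `iris` data (an assumption; the notes do not say which data set was used at this point). The second half illustrates why the `NA's` rows in the flights summary matter: most of these functions return `NA` when values are missing unless `na.rm = TRUE` is set. The vector `y` is made up for illustration.

```r
x <- iris$Sepal.Length

summary(x)            # min, quartiles, mean, max in one call
mean(x); median(x)    # central tendency
min(x); max(x)        # extremes
sd(x)                 # standard deviation
table(iris$Species)   # counts per category

# Missing values propagate unless they are removed explicitly
y <- c(1, 2, NA)
mean(y)                 # NA
mean(y, na.rm = TRUE)   # 1.5
```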
Parsed with column specification:
cols(
Sepal.Length = col_double(),
Sepal.Width = col_double(),
Petal.Length = col_double(),
Petal.Width = col_double(),
Species = col_character()
)
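The column specification above is the kind of message `readr::read_csv()` prints on import, and the later calls use an object named `irisdata`. The original file path is not shown, so the sketch below is only a stand-in: it writes the built-in `iris` data to a temporary CSV and reads it back with base R's `read.csv()`. The temporary file and the round trip are assumptions made for illustration.

```r
# Write the built-in data to a temporary CSV (stand-in for the real file)
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back; Species arrives as text, matching col_character() above
irisdata <- read.csv(csv_path, stringsAsFactors = FALSE)

str(irisdata)

# With readr installed, readr::read_csv(csv_path) would import the same file
# and report how each column was parsed.
```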
jmv::descriptives(
data = irisdata,
vars = "Sepal.Length",
splitBy = "Species",
freq = TRUE,
hist = TRUE,
dens = TRUE,
bar = TRUE,
box = TRUE,
violin = TRUE,
dot = TRUE,
mode = TRUE,
sum = TRUE,
sd = TRUE,
variance = TRUE,
range = TRUE,
se = TRUE,
skew = TRUE,
kurt = TRUE,
quart = TRUE,
pcEqGr = TRUE)
DESCRIPTIVES
Descriptives
─────────────────────────────────────────────────────
Species Sepal.Length
─────────────────────────────────────────────────────
N setosa 50
versicolor 50
virginica 50
Missing setosa 0
versicolor 0
virginica 0
Mean setosa 5.01
versicolor 5.94
virginica 6.59
Std. error mean setosa 0.0498
versicolor 0.0730
virginica 0.0899
Median setosa 5.00
versicolor 5.90
virginica 6.50
Mode setosa 5.00
versicolor 5.50
virginica 6.30
Sum setosa 250
versicolor 297
virginica 329
Standard deviation setosa 0.352
versicolor 0.516
virginica 0.636
Variance setosa 0.124
versicolor 0.266
virginica 0.404
Range setosa 1.50
versicolor 2.10
virginica 3.00
Minimum setosa 4.30
versicolor 4.90
virginica 4.90
Maximum setosa 5.80
versicolor 7.00
virginica 7.90
Skewness setosa 0.120
versicolor 0.105
virginica 0.118
Std. error skewness setosa 0.337
versicolor 0.337
virginica 0.337
Kurtosis setosa -0.253
versicolor -0.533
virginica 0.0329
Std. error kurtosis setosa 0.662
versicolor 0.662
virginica 0.662
25th percentile setosa 4.80
versicolor 5.60
virginica 6.23
50th percentile setosa 5.00
versicolor 5.90
virginica 6.50
75th percentile setosa 5.20
versicolor 6.30
virginica 6.90
─────────────────────────────────────────────────────
# install.packages("scatr")
scatr::scat(
data = irisdata,
x = "Sepal.Length",
y = "Sepal.Width",
group = "Species",
marg = "dens",
line = "linear",
se = TRUE)
https://cran.r-project.org/web/packages/summarytools/vignettes/Introduction.html
Variable: iris$Species
Type: Factor (unordered)
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |
| | Freq | % | % Cum. |
|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 |
| Total | 150 | 100.00 | 100.00 |
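The frequency tables above can be reproduced approximately with base R alone. This is a sketch of the idea, not the summarytools implementation; `counts` and `freq_tab` are illustrative names.

```r
# Counts per category of iris$Species
counts <- table(iris$Species)

# Proportions as plain numbers, then percentages and cumulative percentages
props <- as.vector(prop.table(counts))

freq_tab <- data.frame(
  Freq   = as.vector(counts),
  Pct    = round(100 * props, 2),
  CumPct = round(100 * cumsum(props), 2),
  row.names = names(counts)
)

print(freq_tab)
```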
| smoker / diseased | Yes | No | Total |
|---|---|---|---|
| Yes | 125 (41.95%) | 173 (58.05%) | 298 (100.00%) |
| No | 99 (14.10%) | 603 (85.90%) | 702 (100.00%) |
| Total | 224 (22.40%) | 776 (77.60%) | 1000 (100.00%) |
Generated by summarytools 0.8.5 (R version 3.5.1)
2018-07-21
with(tobacco,
     print(ctable(smoker, diseased, prop = 'n', totals = FALSE),
           omit.headings = TRUE, method = "render"))
| smoker / diseased | Yes | No |
|---|---|---|
| Yes | 125 | 173 |
| No | 99 | 603 |
Generated by summarytools 0.8.5 (R version 3.5.1)
2018-07-21
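The row percentages in the smoker by diseased cross-tabulation can be checked by hand. The sketch below rebuilds the table from the four counts printed above rather than from the `tobacco` data itself, so it runs without summarytools; `tab` and `row_pct` are illustrative names.

```r
# Counts copied from the cross-tabulation above (matrix() fills column by column)
tab <- as.table(matrix(
  c(125, 99, 173, 603), nrow = 2,
  dimnames = list(smoker = c("Yes", "No"), diseased = c("Yes", "No"))
))

# margin = 1 divides each cell by its row total
row_pct <- round(100 * prop.table(tab, margin = 1), 2)

addmargins(tab)   # counts with row and column totals
print(row_pct)    # percentages within each smoker group
```

Each row of `row_pct` sums to 100, which is what "Row Proportions" means in the summarytools output.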
Non-numerical variable(s) ignored: Species
### Descriptive Statistics
Data Frame: iris
N: 150
| | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
|---|---|---|---|---|
| Mean | 5.84 | 3.06 | 3.76 | 1.20 |
| Std.Dev | 0.83 | 0.44 | 1.77 | 0.76 |
| Min | 4.30 | 2.00 | 1.00 | 0.10 |
| Q1 | 5.10 | 2.80 | 1.60 | 0.30 |
| Median | 5.80 | 3.00 | 4.35 | 1.30 |
| Q3 | 6.40 | 3.30 | 5.10 | 1.80 |
| Max | 7.90 | 4.40 | 6.90 | 2.50 |
| MAD | 1.04 | 0.44 | 1.85 | 1.04 |
| IQR | 1.30 | 0.50 | 3.50 | 1.50 |
| CV | 7.06 | 7.01 | 2.13 | 1.57 |
| Skewness | 0.31 | 0.31 | -0.27 | -0.10 |
| SE.Skewness | 0.20 | 0.20 | 0.20 | 0.20 |
| Kurtosis | -0.61 | 0.14 | -1.42 | -1.36 |
| N.Valid | 150.00 | 150.00 | 150.00 | 150.00 |
| Pct.Valid | 100.00 | 100.00 | 100.00 | 100.00 |
descr(iris, stats = c("mean", "sd", "min", "med", "max"), transpose = TRUE,
      omit.headings = TRUE, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.84 | 0.83 | 4.30 | 5.80 | 7.90 |
| Sepal.Width | 3.06 | 0.44 | 2.00 | 3.00 | 4.40 |
| Petal.Length | 3.76 | 1.77 | 1.00 | 4.35 | 6.90 |
| Petal.Width | 1.20 | 0.76 | 0.10 | 1.30 | 2.50 |
tobacco
N: 1000
| No | Variable | Stats / Values | Freqs (% of Valid) | Text Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F<br>2. M | 489 (50.0%)<br>489 (50.0%) | IIIIIIIIIIIIIIII<br>IIIIIIIIIIIIIIII | 978 (97.8%) | 22 (2.2%) |
| 2 | age [numeric] | mean (sd) : 49.6 (18.29)<br>min < med < max : 18 < 50 < 80<br>IQR (CV) : 32 (0.37) | 63 distinct val. | | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34<br>2. 35-50<br>3. 51-70<br>4. 71 + | 258 (26.5%)<br>241 (24.7%)<br>317 (32.5%)<br>159 (16.3%) | IIIIIIIIIIIII<br>IIIIIIIIIIII<br>IIIIIIIIIIIIIIII<br>IIIIIIII | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | mean (sd) : 25.73 (4.49)<br>min < med < max : 8.83 < 25.62 < 39.44<br>IQR (CV) : 5.72 (0.17) | 974 distinct val. | | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes<br>2. No | 298 (29.8%)<br>702 (70.2%) | IIIIII<br>IIIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | mean (sd) : 6.78 (11.88)<br>min < med < max : 0 < 0 < 40<br>IQR (CV) : 11 (1.75) | 37 distinct val. | | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes<br>2. No | 224 (22.4%)<br>776 (77.6%) | IIII<br>IIIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension<br>2. Cancer<br>3. Cholesterol<br>4. Heart<br>5. Pulmonary<br>6. Musculoskeletal<br>7. Diabetes<br>8. Hearing<br>9. Digestive<br>10. Hypotension<br>[ 3 others ] | 36 (16.2%)<br>34 (15.3%)<br>21 ( 9.5%)<br>20 ( 9.0%)<br>20 ( 9.0%)<br>19 ( 8.6%)<br>14 ( 6.3%)<br>14 ( 6.3%)<br>12 ( 5.4%)<br>11 ( 5.0%)<br>21 ( 9.4%) | IIIIIIIIIIIIIIII<br>IIIIIIIIIIIIIII<br>IIIIIIIII<br>IIIIIIII<br>IIIIIIII<br>IIIIIIII<br>IIIIII<br>IIIIII<br>IIIII<br>IIII<br>IIIIIIIII | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | mean (sd) : 1 (0.08)<br>min < med < max : 0.86 < 1.04 < 1.06<br>IQR (CV) : 0.19 (0.08) | 0.86!: 267 (26.7%)<br>1.04!: 249 (24.9%)<br>1.05!: 324 (32.4%)<br>1.06!: 160 (16.0%)<br>! rounded | IIIIIIIIIIIII<br>IIIIIIIIIIII<br>IIIIIIIIIIIIIIII<br>IIIIIII | 1000 (100%) | 0 (0%) |
Data Frame: iris
Group: Species = setosa
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.01 | 0.35 | 4.30 | 5.00 | 5.80 |
| Sepal.Width | 3.43 | 0.38 | 2.30 | 3.40 | 4.40 |
| Petal.Length | 1.46 | 0.17 | 1.00 | 1.50 | 1.90 |
| Petal.Width | 0.25 | 0.11 | 0.10 | 0.20 | 0.60 |
Group: Species = versicolor
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
Group: Species = virginica
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
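The per-species tables above come from summarytools, but the same split-then-summarise idea can be sketched in base R. This is an approximation for one variable (`Sepal.Length`), not the call used in the notes; `five_stats` and `stats_by_species` are illustrative names.

```r
# The five statistics shown in the grouped tables above
five_stats <- function(v) {
  c(Mean = mean(v), Std.Dev = sd(v), Min = min(v),
    Median = median(v), Max = max(v))
}

# split() makes one vector per species; sapply() applies the summary to each,
# giving one column per species, so t() turns species into rows
stats_by_species <- round(
  t(sapply(split(iris$Sepal.Length, iris$Species), five_stats)), 2
)

print(stats_by_species)
```

The `Sepal.Length` rows of the three group tables above should match the rows printed here.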
Output file written: /var/folders/76/rq_s_23s7fd5r8hqrbg8rmnc0000gp/T//RtmpxIdFKz/file8d943e622540.html
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
Generated by summarytools 0.8.6 (R version 3.5.1)
2018-07-21
Variable: tobacco$BMI by age.gr
| | 18-34 | 35-50 | 51-70 | 71 + |
|---|---|---|---|---|
| Mean | 23.84 | 25.11 | 26.91 | 27.45 |
| Std.Dev | 4.23 | 4.34 | 4.26 | 4.37 |
| Min | 8.83 | 10.35 | 9.01 | 16.36 |
| Median | 24.04 | 25.11 | 26.77 | 27.52 |
| Max | 34.84 | 39.44 | 39.21 | 38.37 |
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr, transpose = TRUE,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown", omit.headings = TRUE)
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| 18-34 | 23.84 | 4.23 | 8.83 | 24.04 | 34.84 |
| 35-50 | 25.11 | 4.34 | 10.35 | 25.11 | 39.44 |
| 51-70 | 26.91 | 4.26 | 9.01 | 26.77 | 39.21 |
| 71 + | 27.45 | 4.37 | 16.36 | 27.52 | 38.37 |
tobacco_subset <- tobacco[ ,c("gender", "age.gr", "smoker")]
freq_tables <- lapply(tobacco_subset, freq)
view(freq_tables, footnote = NA, file = 'freq-tables.html')
Output file written: freq-tables.html
| age.gr | Freq | % Valid | % Valid Cumul | % Total | % Total Cumul |
|---|---|---|---|---|---|
| 18-34 | 258 | 26.46 | 26.46 | 25.80 | 25.80 |
| 35-50 | 241 | 24.72 | 51.18 | 24.10 | 49.90 |
| 51-70 | 317 | 32.51 | 83.69 | 31.70 | 81.60 |
| 71 + | 159 | 16.31 | 100.00 | 15.90 | 97.50 |
| <NA> | 25 | | | 2.50 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
| smoker | Freq | % Valid | % Valid Cumul | % Total | % Total Cumul |
|---|---|---|---|---|---|
| Yes | 298 | 29.80 | 29.80 | 29.80 | 29.80 |
| No | 702 | 70.20 | 100.00 | 70.20 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
function ‘is’ appears not to be S3 generic; found functions that look like S3 methods
‘>=’ not meaningful for factors
$properties
$attributes.lengths
    names     class row.names
        5         1       150
$extensive.is
[1] "is.data.frame" "is.list"
[3] "is.object"     "is.recursive"
[5] "is.unsorted"
### Frequencies
**Variable:** tobacco$gender
**Type:** Factor (unordered)
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|-----------:|-----:|--------:|-------------:|--------:|-------------:|
| **F** | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| **M** | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| **\<NA\>** | 22 | | | 2.20 | 100.00 |
| **Total** | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
| gender | Freq | % Valid | % Valid Cumul | % Total | % Total Cumul |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
Generated by summarytools 0.8.6 (R version 3.5.1)
2018-07-21
Loading required package: lattice
Loading required package: ggformula
New to ggformula? Try the tutorials:
learnr::run_tutorial("introduction", package = "ggformula")
learnr::run_tutorial("refining", package = "ggformula")
Loading required package: mosaicData
Loading required package: Matrix
Attaching package: ‘Matrix’
The following object is masked from ‘package:tidyr’:
expand
The 'mosaic' package masks several functions from core packages in order to add
additional features. The original behavior of these functions should not be affected by this.
Note: If you use the Matrix package, be sure to load it BEFORE loading mosaic.
Attaching package: ‘mosaic’
The following object is masked from ‘package:Matrix’:
mean
The following object is masked from ‘package:questionr’:
prop
The following objects are masked from ‘package:dplyr’:
count, do, tally
The following object is masked from ‘package:purrr’:
cross
The following object is masked from ‘package:ggplot2’:
stat
The following objects are masked from ‘package:stats’:
binom.test, cor, cor.test, cov, fivenum, IQR,
median, prop.test, quantile, sd, t.test, var
The following objects are masked from ‘package:base’:
max, mean, min, prod, range, sample, sum
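The masking messages above mean that a bare name such as `mean` or `sd` now resolves to whichever package was attached last. A small sketch of how to stay unambiguous with the `::` operator; it uses only base R and stats, so it runs whether or not mosaic is loaded. The vector `x` is made up for illustration.

```r
x <- c(2, 4, 6, 8)

# package::function always calls that package's version, regardless of masking
base::mean(x)   # 5
stats::sd(x)

# Which package does the bare name currently resolve to?
environmentName(environment(mean))

# conflicts(detail = TRUE) lists every name defined by more than one attached package
```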
Choose a plot type.
1: 1-variable (histogram, density plot, etc.)
2: 2-variable (scatter, boxplot, etc.)
3: map
### Cross-Tabulation / Row Proportions
**Variables:** gender * smoker
**Data Frame:** tobacco
| | | | | |
|-------:|-------:|-------------:|-------------:|---------------:|
| | smoker | Yes | No | Total |
| gender | | | | |
| F | | 147 (30.06%) | 342 (69.94%) | 489 (100.00%) |
| M | | 143 (29.24%) | 346 (70.76%) | 489 (100.00%) |
| \<NA\> | | 8 (36.36%) | 14 (63.64%) | 22 (100.00%) |
| Total | | 298 (29.80%) | 702 (70.20%) | 1000 (100.00%) |
descr(tobacco, style = 'rmarkdown')
print(descr(tobacco), method = 'render', table.classes = 'st-small')
dfSummary(tobacco, style = 'grid', plain.ascii = FALSE)
print(dfSummary(tobacco, graph.magnif = 0.75), method = 'render')
Here, building up a #ggplot2 as slowly as possible, #rstats. Incremental adjustments. #rstatsteachingideas pic.twitter.com/nUulQl8bPh
— Gina Reynolds (@EvaMaeRey) August 13, 2018
Dreaming of a fancy #Rstats #ggplot #dataviz but still scared of typing #code? @_pvictorr esquisse package has you covered https://t.co/1vIDXcVAAF pic.twitter.com/RlTkptnrNv
— Radoslaw Panczak (@RPanczak) October 2, 2018
library(Rcmdr)
Rcmdr::Commander()
http://r4stats.com/articles/software-reviews/r-commander/
This is a compilation; I have tried to cite the sources of quoted material wherever possible.